167 results found.
Speech
Corpus,
Language Type:
Monolingual
Languages:
French
Availability:
From Data Center(s)
License:
Size:
12 hours
Production Status:
Existing-used
Use:
Information Extraction, Information Retrieval
- Paper title: Very Short-term Conflict Intensity Estimation Using Fisher Vectors
- Paper track: 3.6 Social signal processing/Poster Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Gábor Gosztolya | SSPNet Mobile Corpus | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Monolingual
Languages:
Chinese Dutch Finnish French German Greek Hungarian Japanese Russian Spanish
Availability:
Freely Available
License:
Apache-2.0
Size:
None
Production Status:
Existing-used
Use:
Speech Synthesis
- Paper title: One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech
- Paper track: 7.14 Cross-lingual and multilingual aspects in spe/Poster Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Tomáš Nekvinda | CSS10 | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Monolingual
Languages:
Chinese Dutch French German Russian
Availability:
Freely Available
License:
Creative Commons CC0
Size:
None
Production Status:
Existing-updated
Use:
Speech Synthesis
- Paper title: One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech
- Paper track: 7.14 Cross-lingual and multilingual aspects in spe/Poster Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Tomáš Nekvinda | Cleaned Common Voice | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Dari/Pashto Dutch English Finnish French Hindi Icelandic Indonesian Japanese Lithuanian Malay Mandarin Nepali Portuguese Punjabi Romanian Slovenian Spanish
Availability:
From Owner
License:
CreativeCommons
Size:
467 hours
Production Status:
Newly created-finished
Use:
Person Identification
- Paper title: JukeBox: A Multilingual Singer Recognition Dataset
- Paper track: 4.3 Speaker verification and identification/Oral Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Anurag Chowdhury | JukeBox | /N |
Documentation:
Documentation in English language will be made available upon publication of the dataset.
pos-tagging
Bilingual corpora from Europarl (Koehn, 2005),
Language Type:
Multilingual
Languages:
English French German
Availability:
License:
Size:
2M tokens
Production Status:
Use:
Machine Translation, contrastive analysis
- Paper title: The Learnability of the Annotated Input in NMT Replicating (Vanmassenhove and Way, 2018) with OpenNMT
- Paper track: Evaluation/poster presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Nicolas Ballier | Annotated Europarl | /N |
Documentation:
Koehn 2005 paper
Written
Terminology,
Language Type:
Multilingual
Languages:
Arabic Dutch English French German Modern Greek Russian Spanish
Availability:
Freely Available
License:
Size:
4473 concepts
Production Status:
Existing-updated
Use:
Acquisition
- Paper title: Representing Multiword Term Variation in a Terminological Knowledge Base: a Corpus-Based Study
- Paper track: Terminology/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Pilar León-Araúz | EcoLexicon | /N |
Documentation:
https://ecolexicon.ugr.es/en/manual.htm
Written
Corpus,
Language Type:
Bilingual
Languages:
English French
Availability:
Freely Available
License:
CC
Size:
342 problems Other
Production Status:
Newly created-finished
Use:
Textual Entailment and Paraphrasing
- Paper title: A French Version of the FraCaS Test Suite
- Paper track: Written/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Maxime Amblard | French FraCas | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Monolingual
Languages:
French
Availability:
Freely Available
License:
CreativeCommons
Size:
300 hours
Production Status:
Newly created-finished
Use:
Language Modelling
- Paper title: Rhythmic Proximity Between Natives And Learners Of French - Evaluation of a metric based on the CEFC corpus
- Paper track: Speech/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Sylvain Coulange | Corpus d'étude pour le français contemporain | /N |
Documentation:
Documentation here https://www.projet-orfeo.fr/
Written
Corpus,
Language Type:
Multilingual
Languages:
Chinese English French German Japanese Korean Russian Spanish
Availability:
Freely Available
License:
CC-BY-4
Size:
68000000 sentences
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: ParaPat: The Multi-Million Sentences Parallel Corpus of Patents Abstracts
- Paper track: Written/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Felipe Soares | ParaPat | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
French
Availability:
From Owner
License:
Size:
446000 words
Production Status:
Newly created-in progress
Use:
Document Classification, Text categorisation
- Paper title: Age Recommendation for Texts
- Paper track: Written/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Gwénolé Lecorvé | TextToKids corpus | /N |
Documentation:
None




